aiTPR: Attribute Interaction-Tensor Product Representation for Image Caption

Abstract

Region-based visual features enhance the generative capability of captioning machines. However, they lack proper interaction-based attentional perception and therefore end up producing biased or uncorrelated sentences, or pieces of misinformation. In this work, we propose the Attribute Interaction-Tensor Product Representation (aiTPR), a convenient way of gathering more information through orthogonal combination and of learning interactions as physical entities (tensors) to improve captions. Compared to previous works, where features are added up into undefined feature spaces, TPR helps maintain sanity in combinations, and orthogonality helps define familiar spaces. We introduce a new concept layer that defines objects and their interactions, which can play a crucial role in determining different descriptions. The interaction portions contributed heavily to better caption quality, and the model outperformed various previous works in this domain on the MSCOCO dataset. For the first time, we introduce the notion of combining regional image features with an abstracted likelihood embedding for image captioning.
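The abstract does not spell out the architecture, so the following is only a minimal sketch of the general tensor product representation idea it builds on: each "filler" vector is bound to a "role" vector by an outer product, the bound pairs are summed, and orthonormal roles let each filler be recovered cleanly. The function names (`outer`, `bind`, `unbind`) and the toy vectors are hypothetical and are not taken from the paper.

```python
def outer(filler, role):
    """Outer product of filler and role as a nested list (len(filler) x len(role))."""
    return [[f * r for r in role] for f in filler]


def bind(fillers, roles):
    """Sum the outer products of each (filler, role) pair into one tensor."""
    rows, cols = len(fillers[0]), len(roles[0])
    tensor = [[0.0] * cols for _ in range(rows)]
    for filler, role in zip(fillers, roles):
        for i, row in enumerate(outer(filler, role)):
            for j, value in enumerate(row):
                tensor[i][j] += value
    return tensor


def unbind(tensor, role):
    """Multiply the tensor by a role vector; with orthonormal roles this
    returns the filler that was bound to that role (up to rounding)."""
    return [sum(t * r for t, r in zip(row, role)) for row in tensor]


# Hypothetical toy data: two 3-d fillers and two orthonormal 2-d roles.
fillers = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
roles = [[0.6, 0.8], [-0.8, 0.6]]

tensor = bind(fillers, roles)
recovered = unbind(tensor, roles[0])
print(recovered)  # close to the first filler, up to floating-point rounding
```

If the roles were not orthogonal, `unbind` would return a mixture of fillers, which is one way to read the abstract's remark that orthogonality keeps the combined space well defined.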

Similar resources

Learning a Recurrent Visual Representation for Image Caption Generation

In this paper we explore the bi-directional mapping between images and their sentence-based descriptions. We propose learning this mapping using a recurrent neural network. Unlike previous approaches that map both sentences and images to a common embedding, we enable the generation of novel sentences given an image. Using the same model, we can also reconstruct the visual features associated wi...

Multimodal Pivots for Image Caption Translation

We present an approach to improve statistical machine translation of image descriptions by multimodal pivots defined in visual space. Image similarity is computed by a convolutional neural network and incorporated into a target-side translation memory retrieval model where descriptions of most similar images are used to rerank translation outputs. Our approach does not depend on the availabilit...

Cross-Lingual Image Caption Generation

Automatically generating a natural language description of an image is a fundamental problem in artificial intelligence. This task involves both computer vision and natural language processing and is called “image caption generation.” Research on image caption generation has typically focused on taking in an image and generating a caption in English as existing image caption corpora are mostly ...

Topic-Specific Image Caption Generation

Recently, image captioning, which aims to automatically generate a textual description for an image, has attracted researchers from various fields. Encouraging performance has been achieved by applying deep neural networks. Most of these works aim at generating a single caption, which may be incomplete, especially for complex images. This paper proposes a topic-specific multi-caption generator, ...

A Note on Tensor Product of Graphs

Let $G$ and $H$ be graphs. The tensor product $G\otimes H$ of $G$ and $H$ has vertex set $V(G\otimes H)=V(G)\times V(H)$ and edge set $E(G\otimes H)=\{(a,b)(c,d) \mid ac\in E(G) \text{ and } bd\in E(H)\}$. In this paper, some results on this product are obtained by which it is possible to compute the Wiener and hyper-Wiener indices of $K_n \otimes G$.
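The edge-set definition above can be sketched in a few lines of plain Python. This is an illustrative sketch for simple undirected graphs given as vertex and edge lists; the function name `tensor_product` and the example graphs are assumptions made here, not something taken from the paper.

```python
from itertools import product


def tensor_product(g_vertices, g_edges, h_vertices, h_edges):
    """Return (vertices, edges) of the tensor product of two simple
    undirected graphs, each given as a vertex list and an edge list."""
    g_adj = {frozenset(e) for e in g_edges}
    h_adj = {frozenset(e) for e in h_edges}
    vertices = list(product(g_vertices, h_vertices))
    edges = set()
    for (a, b), (c, d) in product(vertices, repeat=2):
        # (a,b)(c,d) is an edge exactly when ac is an edge of G and bd is an edge of H.
        if frozenset((a, c)) in g_adj and frozenset((b, d)) in h_adj:
            edges.add(frozenset(((a, b), (c, d))))
    return vertices, edges


# Example: K_3 (triangle) tensor K_2 (single edge).
k3_vertices, k3_edges = [0, 1, 2], [(0, 1), (1, 2), (0, 2)]
k2_vertices, k2_edges = ["x", "y"], [("x", "y")]

vertices, edges = tensor_product(k3_vertices, k3_edges, k2_vertices, k2_edges)
print(len(vertices), len(edges))
```

Storing edges as `frozenset` pairs keeps the graph undirected, so each edge is counted once even though the loop visits both orderings of its endpoints.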

Journal

Journal title: Neural Processing Letters

Year: 2021

ISSN: 1573-773X, 1370-4621

DOI: https://doi.org/10.1007/s11063-021-10438-5